Scientific Python antipatterns advent calendar day seven
For today, a simple error that is easy to fall into. As a reminder, I’ll post one tiny example per day with the intention that they should only take a couple of minutes to read.
If you want to read them all but can’t be bothered checking this website each day, sign up for the mailing list:
and I’ll send a single email at the end with links to them all.
Iterating over dictionaries without using items
In Python we use dictionaries when we want to be able to efficiently look up the value associated with a key:
# create a dictionary
emails = {
    'Martin' : 'martin@pythonforbiologists.com', # only this one is real!
    'Jess' : 'jessica.rodriguez@gmail.com',
    'Phil' : 'p.smith@outlook.com'
}
# look up an email
emails['Jess']
'jessica.rodriguez@gmail.com'
Sometimes we need to do something to all the key/value pairs in a dictionary. If we’re not sure how to do this, the obvious thing to try is iterating over the dictionary:
for x in emails:
    print(x)
Martin
Jess
Phil
From the output, we will quickly realise that iterating over a dictionary gives us the keys, so we can add a step to get the matching values:
for name in emails:
    email_address = emails[name]
    print(name, email_address, sep='\t')
Martin martin@pythonforbiologists.com
Jess jessica.rodriguez@gmail.com
Phil p.smith@outlook.com
This works, but is harder than it needs to be. Whenever you see this pattern - a loop that iterates over keys, then inside the loop a line that gets the matching value - you can replace it with a call to items.
The items method gives us an iterable (a view object in Python 3, a list in Python 2) where each element is a tuple containing the key and the value. So we can set up the loop in a single line:
for name, email_address in emails.items():
    print(name, email_address, sep='\t')
Martin martin@pythonforbiologists.com
Jess jessica.rodriguez@gmail.com
Phil p.smith@outlook.com
which normally makes the body of the loop clearer. Occasionally you might see this:
for item in emails.items():
    name, email_address = item
    print(name, email_address, sep='\t')
Martin martin@pythonforbiologists.com
Jess jessica.rodriguez@gmail.com
Phil p.smith@outlook.com
or even this:
for item in emails.items():
    name = item[0]
    email_address = item[1]
    print(name, email_address, sep='\t')
Martin martin@pythonforbiologists.com
Jess jessica.rodriguez@gmail.com
Phil p.smith@outlook.com
but both are clumsier than doing the unpacking in the for statement itself.
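To see why the unpacking works, it can help to look at what items actually produces. The short sketch below rebuilds the emails dictionary from above. The names pairs, unpacked and by_index are made up for illustration. In Python 3 the result of items is a view rather than a true list, so it is wrapped in list() here purely so we can index into it.

```python
emails = {
    'Martin': 'martin@pythonforbiologists.com',
    'Jess': 'jessica.rodriguez@gmail.com',
    'Phil': 'p.smith@outlook.com',
}

# items() yields (key, value) tuples; list() copies them so we can inspect one
pairs = list(emails.items())
print(pairs[0])

# unpacking in the for statement gives the same result as indexing each tuple by hand
unpacked = [(name, email_address) for name, email_address in emails.items()]
by_index = [(item[0], item[1]) for item in emails.items()]
print(unpacked == by_index)
```

Each element is an ordinary two-item tuple, which is why writing two names after for is enough to split it into key and value.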
Bonus: it’s only necessary to use a dictionary if we want to be able to look up individual keys. If we find ourselves building a dictionary, but only ever using it in a loop, then it should probably be a list of tuples instead:
emails = [
    ('Martin','martin@pythonforbiologists.com'), # only this one is real!
    ('Jess','jessica.rodriguez@gmail.com'),
    ('Phil','p.smith@outlook.com')
]
which will be even easier to iterate over:
for name, email_address in emails:
    print(name, email_address, sep='\t')
Martin martin@pythonforbiologists.com
Jess jessica.rodriguez@gmail.com
Phil p.smith@outlook.com
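and avoids building a hash table that is never used for lookups. Whether it also saves memory depends on the Python version and the data, so it is worth measuring if that matters.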
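If it later turns out that we do need to look up individual names, the two forms are easy to convert between, so starting with a list of tuples doesn't lock us in. A minimal sketch, where emails_by_name and round_trip are names invented for this example:

```python
emails = [
    ('Martin', 'martin@pythonforbiologists.com'),
    ('Jess', 'jessica.rodriguez@gmail.com'),
    ('Phil', 'p.smith@outlook.com'),
]

# dict() accepts any iterable of (key, value) pairs
emails_by_name = dict(emails)
print(emails_by_name['Jess'])

# and items() takes us back to the pairs, in the same order
round_trip = list(emails_by_name.items())
print(round_trip == emails)
```

The round trip only preserves the list exactly when the names are unique, since a dictionary keeps just one value per key.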
One more time; if you want to see the rest of these little write-ups, sign up for the mailing list: